Rafael S. de Souza
4/17/2017
R integrates data manipulation, graphics and extensive statistical analysis. Uniform documentation and coding standards. But quality control is limited.
Easy download from http://www.r-project.org for Windows, Mac or Linux. On-the-fly installation of CRAN packages.
More than 10,500 user-provided add-on CRAN packages, with tens of thousands of statistical functions.
Principal difficulty: finding what you want, and understanding what you find. Google helps with the former problem. Improved education in statistics addresses the latter.
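As a rough illustration of on-the-fly installation and of searching for functions from inside R, the sketch below may help. The helper name `ensure_installed` and the mirror URL are assumptions made for this example, not part of the original slides; `requireNamespace`, `install.packages`, `apropos` and `help.search` are standard R functions.

```r
# Hypothetical helper: install a CRAN package only when it is missing.
# The default mirror URL is an assumption; any CRAN mirror would do.
ensure_installed <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call downloads nothing
ensure_installed("stats")

# List objects on the search path whose names match a pattern
csv_functions <- apropos("^read\\.csv")
print(csv_functions)

# Full-text search over installed help pages (uncomment to try interactively)
# help.search("linear models")
# Help page for a function whose name is already known
# ?lm
```

Checking with `requireNamespace` first avoids reinstalling a package on every run, which is one common way to keep a script portable across machines.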
| Bayesian Inference | Machine Learning | Social Sciences |
| Computational Physics | Medical Image Analysis | Spatial Data |
| Cluster Analysis | Multivariate Statistics | Statistical Genetics |
| Differential Equations | Natural Language | Survival Analysis |
| Econometrics | Numerical Mathematics | Time Series Analysis |
| Environmetrics | Optimization | Visualization |
| | Pharmacokinetics | Web Technologies |
| Extreme Value Analysis | Phylogenetics | |
| Empirical Finance | Probability Distributions | |
| Functional Data Analysis | Psychometrics | |
library(ggplot2)
library(reshape2)
library(d3heatmap)
library(circlize)
library(ggdendro)
1+1
## [1] 2
x <- 2
for (i in 1:5) {
  print(x + i)
}
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
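The loop above can also be written without explicit iteration, since arithmetic in R is vectorized. The following is a minimal sketch of that idea; the variable name `result` is introduced here only for illustration.

```r
x <- 2

# Adding a scalar to a vector applies the operation element-wise
x + 1:5
## [1] 3 4 5 6 7

# The same values through an apply-style call with a declared return type
result <- vapply(1:5, function(i) x + i, numeric(1))
print(result)
```

For simple element-wise arithmetic the vectorized form is usually shorter and faster than a `for` loop, while `vapply` is closer to the loop when each step needs more than one expression.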
x <- rnorm(100)
hist(x)
set.seed(1056) # set seed to replicate example
nobs <- 150 # number of observations in the model
x1 <- runif(nobs,0,5) # random uniform variable
mu <- 1 + 5 * x1 - 0.75 * x1 ^ 2 # linear predictor, xb
y <- rnorm(nobs, mu, sd=0.5) # create y as adjusted random normal variate
fit <- lm(y ~ x1 + I(x1^2)) # normal (Gaussian) fit
summary(fit)
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| x1 | 4.867 | 0.1057 | 46.05 | 3.086e-89 |
| I(x1^2) | -0.7204 | 0.02074 | -34.74 | 9.438e-73 |
| (Intercept) | 1.118 | 0.1111 | 10.06 | 1.92e-18 |
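Because the data were simulated with known coefficients (1, 5 and -0.75), the estimates can be checked against the truth. The sketch below repeats the simulation so that it runs on its own, then places the true values next to the estimates and their 95% confidence intervals from `confint`. The object names `true_coef` and `comparison` are assumptions introduced for this example.

```r
# Repeat the simulation so the block is self-contained
set.seed(1056)
nobs <- 150
x1 <- runif(nobs, 0, 5)
mu <- 1 + 5 * x1 - 0.75 * x1^2
y <- rnorm(nobs, mu, sd = 0.5)
fit <- lm(y ~ x1 + I(x1^2))

# True values used to generate the data, named like the model terms
true_coef <- c("(Intercept)" = 1, "x1" = 5, "I(x1^2)" = -0.75)

estimates <- coef(fit)
intervals <- confint(fit, level = 0.95)

# Side-by-side comparison of truth, estimate and interval bounds
comparison <- data.frame(
  true = true_coef[names(estimates)],
  estimate = estimates,
  lower = intervals[, 1],
  upper = intervals[, 2]
)
print(round(comparison, 3))
```

If the model is specified correctly, each interval would be expected to cover its true value in roughly 95% of repeated simulations.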
xx <- seq(0, 5, length.out = 200)
ypred <- predict(fit, newdata = data.frame(x1 = xx), type = "response") # prediction from the model
plot(x1, y, pch = 19, col = "red") # plot the data
lines(xx, ypred, col = "cyan", lwd = 4, lty = 2) # add the fitted regression curve
segments(x1, fitted(fit), x1, y, lwd = 2, col = "gray") # add the residuals
Read and display data in table format
d <- read.csv("exoplanets.csv", header = TRUE)
d <- d[complete.cases(d), ] # keep only rows without missing values
head(d)
nc <- cor(d[, c(2, 3, 5, 6, 7)]) # correlation matrix of the selected columns
chordDiagram(nc)
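Since `exoplanets.csv` is not included here, the same steps can be tried on a dataset that ships with R. The sketch below uses `mtcars` purely as a stand-in (an assumption of this example, unrelated to the exoplanet table) and picks columns by checking that they are numeric rather than by hard-coded position.

```r
# mtcars is a stand-in for the exoplanet table, which is not bundled here
d <- mtcars[complete.cases(mtcars), ]

# cor() needs numeric input, so identify the numeric columns first
numeric_cols <- names(d)[vapply(d, is.numeric, logical(1))]

# Correlation matrix of the first five numeric columns
nc <- cor(d[, numeric_cols[1:5]])
print(round(nc, 2))

# With circlize installed, the matrix can be drawn as in the slide:
# circlize::chordDiagram(nc)
```

Selecting columns by name or type tends to be more robust than numeric indices such as `c(2, 3, 5, 6, 7)`, which silently point at different variables if the file layout changes.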